Optimising selection of units from speech databases for concatenative synthesis
نویسندگان
چکیده
Concatenating units of natural speech is one method of speech synthesis. Most such systems use an inventory of xed length units, typically diphones or triphones with one instance of each type. An alternative is to use more varied, non-uniform units extracted from large speech databases containing multiple instances of each. The greater variability in such natural speech segments allows closer modeling of naturalness and di erences in speaking styles, and eliminates the need for specially-recorded, single-use databases. However, with the greater variability comes the problem of how to select between the many instances of units in the database. This paper addresses that issue and presents a general method for unit selection.
منابع مشابه
Databases of Heterogeneous Segments for Concatenative Speech Synthesis
Heterogeneous segments can enhance the quality of concatenative speech synthesis especially for highly inflected languages. In this paper we present a brief analysis of the segment types on a general level and discuss the problems related to optimising databases of heterogeneous segments. We present a brief discussion of the algorithmical complexity for the proposed approach and offer some heur...
متن کاملمراحل و نحوه ی تهیه ی دادگان های صوتی هجایی و دایفونی برای سامانه ی تبدیل متن به گفتار فارسی
Abstract Speech databases are part of the concatenative text to speech synthesis systems. Phonetic quality of the databases plays a significant role in the naturalness of the synthesized speech. This paper introduces two syllable and diphone speech databases for Persian and investigates the way of their development and their specifications and their advantages to each other. ...
متن کاملA System for Data-driven Concatenative Sound Synthesis
In speech synthesis, concatenative data-driven synthesis methods prevail. They use a database of recorded speech and a unit selection algorithm that selects the segments that match best the utterance to be synthesized. Transferring these ideas to musical sound synthesis allows a new method of high quality sound synthesis. Usual synthesis methods are based on a model of the sound signal. It is v...
متن کاملDiphone synthesis using unit selection
This paper describes an experimental AT&T concatenative synthesis system using unit selection, for which the basic synthesis units are diphones. The synthesizer may use any of the data from a large database of utterances. Since there are in general multiple instances of each concatenative unit, the system performs dynamic unit selection. Selection among candidates is done dynamically at synthes...
متن کاملACTOR: A multilingual unit-selection speech synthesis system
The ACTOR® Text-To-Speech (TTS) synthesis system, developed at Loquendo S.p.A., is here described. The system employs a unit -selection concatenative synthesis technique, relying on labeled acoustic databases providing phonetic and prosodic coverage of the intended language/domain and on an original algorithm for run-time selection of the acoustic units to be concatenated. This technique yields...
متن کامل